kraco.dev

Sign In
Optimizing large-scale SQL queries for telecommunications database

Problem Statement

A report query is not completing as expected. The underlying query is as follows:

SQL
1SELECT d.user_identifier,
2 SUM(
3 CASE
4 WHEN CHAR_LENGTH(l.message) < 159 THEN 1
5 ELSE CEIL(CHAR_LENGTH(l.message) / 150)
6 END
7 ) AS msg_blocks
8FROM log l
9JOIN users u ON u.user_id = l.user_id
10JOIN dashboard d ON d.dashboard_id = u.dashboard_id
11WHERE (d.dashboard_id > 69 AND d.dashboard_id < 118)
12 AND l.timestamp >= '2025-01-01 00:00:00'
13 AND l.reply_message = 1
14GROUP BY d.user_identifier;
15

Problem Replication

Confirmed that the original query hangs for more than 100 seconds in the database management tool.

Troubleshooting

Suspected that the CASE and CHAR_LENGTH() functions might be causing full table scans. A slight refactor was attempted:

SQL
1SELECT d.user_identifier,
2 SUM(
3 CASE
4 WHEN CHAR_LENGTH(l.message) < 159 THEN 1
5 ELSE CEIL(CHAR_LENGTH(l.message) / 150)
6 END
7 ) AS msg_blocks
8FROM log l
9JOIN users u ON u.user_id = l.user_id
10JOIN dashboard d ON d.dashboard_id = u.dashboard_id
11WHERE (d.dashboard_id > 69 AND d.dashboard_id < 118)
12 AND l.timestamp >= '2025-01-01 00:00:00'
13 AND l.reply_message = 1
14GROUP BY d.user_identifier;
15

There was no improvement in performance; execution still exceeded 100 seconds.

Isolating the Issue with a Simplified Query

SQL
1SELECT
2 l.user_id,
3 SUM(
4 CASE
5 WHEN CHAR_LENGTH(l.message) < 159 THEN 1
6 ELSE CEIL(CHAR_LENGTH(l.message) / 150)
7 END
8 ) AS msg_blocks
9FROM log l
10WHERE l.timestamp >= '2025-01-01 00:00:00'
11 AND l.reply_message = 1
12GROUP BY l.user_id;
13

Even without joins, the query still takes over 100 seconds.

Indexing Investigation

Checked the number of rows scanned:

SQL
1SELECT COUNT(l.user_id) AS record_count
2FROM log l
3WHERE l.timestamp >= '2025-01-01 00:00:00'
4 AND l.reply_message = 1;
5

Returned approximately 55,000 rows but still took more than 10 seconds, indicating potential indexing issues.

Index Addition

SQL
1ALTER TABLE log ADD INDEX idx_log_reply_timestamp (reply_message, timestamp);
2

This improved count queries, but the main SUM query still showed no significant improvement. Precomputing results using temporary tables or CTEs was considered but would not address the root cause.

Checking for Table Fragmentation

SQL
1SELECT
2 TABLE_NAME,
3 ENGINE,
4 DATA_LENGTH,
5 INDEX_LENGTH,
6 DATA_FREE
7FROM information_schema.tables
8WHERE TABLE_NAME = 'log';
9

Results showed significant fragmentation, with approximately 30% of the table's size marked as free space. Rebuilding the table was considered.

Table Rebuild Using Online Schema Change

Backup

BASH
1sudo mysqldump -u root -p --single-transaction --quick --triggers database_name log | sudo tee /home/backups/mysql/log_backup_$(date +\%F).sql > /dev/null
2

Checked the backup with:

BASH
1head -n 20 /home/backups/mysql/log_backup_$(date +\%F).sql
2

Running pt-online-schema-change

BASH
1pt-online-schema-change --execute --alter "ENGINE=InnoDB" \
2 --user=root --ask-pass \
3 --socket=/var/run/mysqld/mysqld.sock \
4 --alter-foreign-keys-method=auto D=database_name,t=log
5

The rebuild took approximately 30 minutes.

Solution Testing

Retested the query post-rebuild:

SQL
1SELECT l.user_id,
2 SUM(
3 CASE
4 WHEN CHAR_LENGTH(l.message) < 159 THEN 1
5 ELSE CEIL(CHAR_LENGTH(l.message) / 150)
6 END
7 ) AS msg_blocks
8FROM log l
9WHERE l.timestamp >= '2025-01-01 00:00:00'
10 AND l.reply_message = 1
11GROUP BY l.user_id;
12

Performance improved but issues persisted for queries spanning dates beyond mid-January.

Additional Index Optimization

Added a composite index incorporating user_id:

SQL
1ALTER TABLE log ADD INDEX idx_log_reply_timestamp_user (reply_message, timestamp, user_id);
2

This improved performance for mid-range queries but did not resolve issues for broader date ranges.

Execution Plan Issues and Index Forcing

Using EXPLAIN, MySQL reported an optimal execution plan, but the actual execution time did not reflect this. Forcing the index manually resolved the issue:

SQL
1SELECT l.user_id,
2 SUM(
3 CASE
4 WHEN CHAR_LENGTH(l.message) < 159 THEN 1
5 ELSE CEIL(CHAR_LENGTH(l.message) / 150)
6 END
7 ) AS msg_blocks
8FROM log l FORCE INDEX (idx_log_reply_timestamp_user)
9WHERE l.timestamp >= '2025-01-01 00:00:00'
10 AND l.reply_message = 1
11GROUP BY l.user_id;
12

Recommendations

Recreate Old Indexes – If MySQL is misreporting index usage, dropping and recreating indexes may help.

Partitioning – The log table should be partitioned by timestamp to optimize time-based queries.

Automate Table Rebuilds – Periodic rebuilds using pt-online-schema-change via a cron job could prevent fragmentation issues.

Index Optimization Monitoring – Regularly check query plans and adjust indexes to ensure optimal performance.